

NAVAL POSTGRADUATE SCHOOL

Monterey, California


THESIS


PREDICTION OF NAVY E-4 TEST PASSERS

by

Edwin Franklin Beach

September 1979


Thesis Advisor: R. R. Read

Approved for public release; distribution unlimited




UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT DOCUMENTATION PAGE
(READ INSTRUCTIONS BEFORE COMPLETING FORM)

1.   REPORT NUMBER:
2.   GOVT ACCESSION NO.:
3.   RECIPIENT'S CATALOG NUMBER:
4.   TITLE (and Subtitle): Prediction of Navy E-4 Test Passers
5.   TYPE OF REPORT & PERIOD COVERED: Master's Thesis; September 1979
6.   PERFORMING ORG. REPORT NUMBER:
7.   AUTHOR(s): Edwin Franklin Beach
8.   CONTRACT OR GRANT NUMBER(s):
9.   PERFORMING ORGANIZATION NAME AND ADDRESS:
     Naval Postgraduate School, Monterey, CA 93940
10.  PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
11.  CONTROLLING OFFICE NAME AND ADDRESS:
     Naval Postgraduate School, Monterey, CA 93940
12.  REPORT DATE: September 1979
13.  NUMBER OF PAGES: 55
14.  MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
     Naval Postgraduate School, Monterey, CA 93940
15.  SECURITY CLASS. (of this report): Unclassified
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16.  DISTRIBUTION STATEMENT (of this Report):
     Approved for public release; distribution unlimited
17.  DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18.  SUPPLEMENTARY NOTES:
19.  KEY WORDS (Continue on reverse side if necessary and identify by block number):
20.  ABSTRACT (Continue on reverse side if necessary and identify by block number):

This thesis applies hierarchical clustering and quadratic
discriminant function techniques to the problem of predicting
E-4 test passers and non-test passers (including non-test takers)
in the Navy. The biographic data base includes items such as
test scores and education to serve as separators and predictors
in the techniques.


DD Form 1473, 1 Jan 73 (Page 1)
EDITION OF 1 NOV 65 IS OBSOLETE
S/N 0102-014-6601


UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


(20. ABSTRACT Continued)

The clustering of rates permitted accumulation of personnel
in the lightly staffed ratings into similar groups of substantial
size, was objective, and may be useful for purposes other than
the present one. The discriminant analysis produced correct
classification rates of about 60 to 70 percent based on the data
at hand.


DD Form 1473, 1 Jan 73
S/N 0102-014-6601


UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

Approved  for  public  release;  distribution  unlimited 


PREDICTION OF NAVY E-4 TEST PASSERS


by


Edwin Franklin Beach
Lieutenant, United States Navy
B.S., New Mexico Institute of Mining and Technology, 1971


Submitted in partial fulfillment of the
requirements for the degree of


MASTER OF SCIENCE IN OPERATIONS RESEARCH


from the
NAVAL POSTGRADUATE SCHOOL


September 1979


ABSTRACT 


This thesis applies hierarchical clustering and
quadratic discriminant function techniques to the
problem of predicting E-4 test passers and non-test
passers (including non-test takers) in the Navy. The
biographic data base includes items such as test
scores and education to serve as separators and
predictors in the techniques.

The clustering of rates permitted accumulation of
personnel in the lightly staffed ratings into similar
groups of substantial size, was objective, and may be
useful for purposes other than the present one. The
discriminant analysis produced correct classification
rates of about 60 to 70 percent based on the data at
hand.


TABLE  OF  CONTENTS 


I.     INTRODUCTION

II.    DISCUSSION OF DATA

III.   DEFINITION OF GROUPS

IV.    APPROACHES TO ANALYSIS

V.     TECHNICAL DETAILS OF PROCEDURES USED

VI.    RESULTS OF ANALYSIS

VII.   CONCLUSIONS

Appendix A:    LIST OF DETERMINING FACTORS

Appendix B:    K GROUP TEST FOR EQUALITY OF GROUP MEANS

Appendix C:    K GROUP CLASSIFICATION FUNCTION

Appendix D:    TESTS OF EQUALITY OF GROUP MEANS

Appendix E:    POWER OF DETERMINING FACTORS

Appendix F:    RESULTS OF CLUSTERING OF SEAMEN RATING

Appendix G:    CLASSIFICATION RESULTS

Appendix H:    CLUSTERING TREE FOR SEAMEN RATING

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST


I.   INTRODUCTION


In these days of increasing costs, the Navy must ensure
it is recruiting and retaining the best possible personnel.
Enlistees should not only have the capacity to do the work,
but also the desire and drive to accomplish the assigned
tasks. Currently, there are few measures of effectiveness
which indicate how an enlistee will perform in the Navy.
The SCREEN score, developed by Dr. Robert Lockman of the
Center for Naval Analyses, predicts attrition of first term
recruits as a function of test score and biographic data
[1]. It does not, however, specifically address the
performance of a member while in the Navy; rather it
predicts survival in the Navy.

The advancement to E-4 is a possible measure of
effectiveness with which a member's utility to the Navy
could be judged. First, the member must display a certain
amount of initiative and drive to prepare for the exam.
Second, passing the test is an indication that a member has
reached a basic level of knowledge within a specific rating.
It is at this point that the Navy begins to get a return on
its investment in the member. The E-4 test, therefore, may
be used as a measure of both a member's capacity and desire
to perform in the Navy.

This thesis explores the differences between the E-4
test passers and non-test passers based on the individual's
biographic and demographic data. One would like to know
what variables in a member's background are associated with
his ability and desire to pass the exam. Specifically, an
ordered list of determining factors is desired. Also, it is
important to learn if the determining factors are identical
for all ratings or if they vary from rating to rating.
Finally, it would be interesting to discover how length of
time in service and time in rate are related to a member's
tendency to advance.

For the purposes of this analysis, a test passer is
defined as a member who has taken the E-4 exam and received
a score which caused him to be advanced to E-4. A non-test
passer is a member who has either taken the E-4 exam and did
not receive a high enough score to be advanced or who did
not take the exam even though he was eligible. Most of the
non-test passers in the study are non-test takers. The
separation of the non-test takers from the non-test passers
was not possible due to the lack of data. The reader will
need to keep these definitions in mind while reading the
text. For clarity, a graphical presentation of these
definitions may be found in Figure 1 in Chapter III.

The data used in this study come from all personnel
eligible to take the E-4 exam in August 1977. A member was
considered eligible if he had at least one year in the Navy
and at least six months as an E-3.

Two major results were obtained from this study. First,
the use of non-linear discrimination resulted in correct
classification of 60 to 70 percent of the members of each
sample examined. Second, hierarchical clustering was applied
to group the ratings (especially those with small numbers of
members). These groupings may have general use beyond those
of this paper.


II.   DISCUSSION OF DATA


There are many possible factors which may affect a
member's desire and ability to pass the E-4 exam. Examples
of these are years of education, job satisfaction,
leadership of superiors, sea/shore assignment, etc. Data
for some of the more appealing of these were unavailable for
the study. The factors used in this analysis were all drawn
from the August 1977 monthly Enlisted Master Record (EMR)
file located at the Naval Military Personnel Center in
Arlington, Virginia. The month of August was chosen because
it coincided with a month in which the E-4 exams were given.

The EMR is a 3000 character record which is maintained
on each active duty and reserve enlisted member of the Navy.
Such items as social security number, pay entry base date,
schools attended, and test scores are stored on each record.
Each record is a condensed version of a member's personnel
jacket.

To make the data easier to manipulate, the 3000
character record file was reduced to a 150 character record
file. Each record contained the following demographic data:

Age
Sex
Race
Ethnic group
Home of record
Dependents
Time in rate
Active duty/Reserve duty indicator
Length of service
End of active duty obligation date
Enlistment term
Years of education
Education certification
A-School/No A-School indicator
Special test scores
SCREEN score
Mental aptitude test scores (ASVAB or BTB)

The special test scores measure aptitude in the areas of
sonar, electronics, and radio. The SCREEN score is a
measure of a recruit's chances of completing the first two
years of his enlistment. The mental aptitude tests are a
measure of the member's overall intelligence. The Basic
Test Battery (BTB) consists of five exams in the areas of
general intelligence, numerical reasoning, and mechanical,
clerical, and shop aptitude. The Armed Services Vocational
Aptitude Battery (ASVAB) contains sixteen tests measuring
specific areas of intelligence and aptitude, including the
ones in the BTB. Unfortunately, the ASVAB had not been
given to a large enough sample of the population to allow it
to be used in the analysis.

From the data set, the individuals were extracted whose
length of service was at least one year, and whose time in
rate was at least six months. These were considered the
members eligible to take the E-4 exam. Since completion of
correspondence courses and practical factors is not
reflected in the EMR, it was not possible to check these
eligibility requirements. However, it was assumed that if a
member fulfilled the time requirements and desired to take
the exam, he would have his other requirements completed.

The variables of the data were originally in three
scales: binary, nominal, and interval. Nominal data are
data in which numbers are used as labels or "names", e.g.,
Black=1, Chicano=2, White=3. Binary data are a special
form of the nominal where only two responses are available.
Interval data include all continuous variables where
differences between scores are meaningful. For example,
psychological test data are treated as interval data. All
of the nominal data were converted to binary values in order
to meet the assumptions of the analysis. For instance, the
educational certification was transformed from the
responses: no degree, GED, high school diploma, bachelor's
degree, and postgraduate degree, to high school diploma or
no high school diploma. The GED and college degree holders
were grouped with the high school diploma holders.
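The conversion from nominal to binary values can be illustrated with a minimal Python sketch. The label strings and the function names `diploma_indicator` and `indicator_columns` are hypothetical; the text does not give the codes actually stored on the EMR, and the indicator-column expansion is shown only as one common way of turning a multi-valued nominal variable into binary ones, not necessarily the one used in the study.

```python
# Hypothetical certification labels; the actual EMR codes are not given in the text.
HAS_DIPLOMA = {
    "GED",
    "high school diploma",
    "bachelor's degree",
    "postgraduate degree",
}


def diploma_indicator(certification):
    """Return 1 for a high school diploma (or GED / college degree), else 0."""
    return 1 if certification in HAS_DIPLOMA else 0


def indicator_columns(value, categories):
    """Expand one nominal value into one binary indicator per category."""
    return [1 if value == category else 0 for category in categories]


if __name__ == "__main__":
    print(diploma_indicator("GED"))
    print(indicator_columns("Chicano", ["Black", "Chicano", "White"]))
```

Grouping GED and degree holders with diploma holders, as described above, reduces the five responses to a single binary variable.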

Some variables were not used in parts or all of the
analysis because it was found there were not enough members
in the sample with values for these variables. For example,
sex could not be used in analyzing the Boiler Technician
rating because there were no female members in the sample
population. Likewise, the ASVAB scores could not be used
because not enough of the sample population had taken this
test. Some of the other determining factors were not used
because it was decided at the outset that the variable could
not be used in policy decisions. Home of record is an
example of this type of variable. All variables used in the
analysis with a description of their types and ranges are
listed in Appendix A.




III.   DEFINITION  OF  GROUPS 


Due to the non-availability of data listing all test
takers and test passers for the August 1977 cycle, a
comparison of the August 1977 EMR and the August 1978 EMR
was necessary to determine who passed the exam. The E-4
exam is given only twice a year and all test passers from a
given cycle who are advanced to E-4 are advanced from three
to nine months after the cycle. Therefore, if a member was
eligible to take the exam according to the August 1977 file
and was promoted between December 1977 and May 1978, he was
classified as a test passer. If he was eligible to take the
exam in August 1977 but he was not promoted during the
advancement period, he was classified as a non-test passer
regardless of whether or not he took the exam. This is
illustrated in Figure 1. Finally, if he was eligible to
take the exam in August 1977 and was not in the August 1978
file, he was listed as out of service because it was not
possible to determine if he passed the exam or not.
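The grouping rule described above can be summarized in a short Python sketch. The function and argument names are hypothetical, and the inputs are assumed to have already been derived by comparing the August 1977 and August 1978 files; this is an illustration of the stated rule, not the program used in the study.

```python
def is_eligible(months_of_service, months_in_rate):
    """Time requirements from the text: one year in service, six months in rate."""
    return months_of_service >= 12 and months_in_rate >= 6


def assign_group(months_of_service, months_in_rate,
                 in_aug_1978_file, promoted_dec77_to_may78):
    """Return the group label for one member, or None if not eligible."""
    if not is_eligible(months_of_service, months_in_rate):
        return None
    if not in_aug_1978_file:
        # Promotion status cannot be determined without the later record.
        return "out of service"
    if promoted_dec77_to_may78:
        return "test passer"
    # Includes members who never took the exam.
    return "non-test passer"


if __name__ == "__main__":
    print(assign_group(14, 7, True, True))
    print(assign_group(14, 7, True, False))
    print(assign_group(14, 7, False, False))
```

The absence check comes first because, as noted above, promotion status is unknown for a member missing from the August 1978 file.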




[Figure 1. Graphical Representation of Test Passers and Non-test
Passers. Figure labels: "Shaded area includes all non-test passers
as defined above"; "Non-test takers".]


Once the groups were determined, it was possible to
evaluate the data by either of two methods. First, since
the out of service member was in the Navy at the time the
test was given, he was actually a test passer or a non-test
passer. The assumption would be made that the reason he
left the service was that he completed his time in service
requirements. Therefore, time alone would distinguish the
out of service member from the test passer and the non-test
passer. If all factors involving time in the Navy such as
length of service and time in rate are ignored, then the
test passer and non-test passer groups may be looked on as a
random subset of the original test passer / non-test passer
/ out of service data. This gives just two all inclusive
groups as they were found on the day of the test.
Unfortunately, one is forced to ignore military time related
factors (variables 13, 14, and 15, Appendix A) which might
prove to be useful. Even so, this approach might well be
considered useful in evaluating enlistees who as yet have no
time in service.

The second method allows the inclusion of military time
dependent factors. In this approach the data are viewed at
the test cycle plus one year position (August 1978). From
this time perspective, a member may fall into only one of
the three groups: test passer, non-test passer, or out of
service. Since all members are included in this approach
and no assumptions of random subsets are made, the time
dependent factors may be used. Although classification of
out of service personnel may at first seem extraneous to the
problem, it is useful knowledge, and may in fact be as
important as identifying the test passers.

Both of the above methods of grouping may be applied at
various levels of the Navy's structure. First they may be
used at the all Navy level. Here all E-3's who meet the
time requirements for E-4 would be considered, regardless of
rate. Next, the individual ratings may be investigated.
Before a member may take an E-4 exam in a particular rating,
he must complete certain study and practical factors
requirements for that rating. Members in the process of
doing this are referred to as designated strikers in that
rating and are identified on the EMR file. By analyzing
only the designated strikers in a specific rate, it may be
possible to predict the test passers for that rate.
Finally, for the sample available, certain ratings had too
few designated strikers to give any conclusive results. To
handle these cases, hierarchical clustering methods could be
used to group ratings according to similarity of specified
attributes of the ratings. Each group of ratings could then
be subjected to analysis.




IV.   APPROACHES  TO  ANALYSIS 


The data used in this analysis consisted of fifteen
variables per member. The intent was to divide the members
into groups according to these variables. Hence, the use of
some form of multivariate analysis was dictated.

The intent of the analysis was two-fold. First, it was
necessary to find out if a difference existed between the
groups in terms of the available data. Second, if a
difference did exist, was it possible to obtain a
classification scheme which would correctly classify an
acceptably high percentage of the members?

Discriminant analysis is one method of performing the
desired analyses. Discriminant analysis is a multivariate
statistical technique used for constructing decision rules
by which data units (enlisted members) may be assigned to
groups to which they have the greatest resemblance. These
decision rules are statistical functions. The independent
variables in the functions are the attributes of the member.
This analysis is valid for both the two group and the three
group case.

There are three basic assumptions underlying
discriminant analysis [2]. First, the groups must be
discrete and identifiable. Both grouping methods defined in
the last section meet this requirement. Second, each
observation in each group can be described by a set of m
variables. This condition is met by use of the fifteen
determining factors. Finally, all m variables are assumed
to have a multivariate normal distribution. All of the
interval data for this study were unimodal and symmetric.
The data were judged to have a close enough approximation to
a normal distribution to meet this assumption. The binary
data, of course, did not. However, according to Gilbert [3],
binary data may be used with little loss of discriminating
power. This is because the rotations of the axes which
occur in discriminant analysis cause the binary data to
appear to be binary no longer.

As   an   example,   consider  a  two  space  problem  of  binary 
vs.  interval  data: 


[Figure 2. Binary vs. Interval Data]


If the axis is rotated to obtain a discriminant
function, the data no longer appear to be binary vs.
interval, but interval vs. interval as shown in the
following figure.




[Figure 3. Transformed Binary vs. Interval Data]


Given that all the basic assumptions for the use of
discriminant analysis are satisfied, the type of
discriminant function must be chosen. If the
variance-covariance matrices of the groups are equal, a
linear function may be used. If they are not equal, then
better discrimination is achieved by a quadratic function.
A two-dimensional illustration of this is shown in the
following diagram. The inequality of the covariance
matrices is reflected by the elliptical contours of constant
density.




[Figure 4. Quadratic vs. Linear Discriminating Functions]


The quadratic function above misclassifies only one data
point, whereas the linear function misclassifies five
points. When given a choice, even if the
variance-covariance matrices are equal, it is wise to choose
the quadratic function. The quadratic should always perform
at least as well as the linear function.


As was stated in Chapter III, some ratings do not have a
large enough sample in any testing cycle to allow one to
conduct a meaningful analysis. To resolve this problem,
cluster analysis was used to group the ratings.

There are several methods of cluster analysis available.
The particular type used in this study is termed
hierarchical clustering. For this method, the data
attributes of each entity (rating) are investigated to
determine which two entities are the most alike [4]. These
two entities are clustered together to form a new entity and
an averaged set of variables is computed for the newly
formed cluster. This cycle is repeated until there is one
final cluster containing all of the original entities. The
results can be represented as a tree with all the original
units on the left and the single cluster on the right.
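The merge cycle just described can be sketched in a few lines of Python. This is a generic illustration, not the routine used in the study: it assumes that "most alike" means the smallest squared Euclidean distance between cluster averages, and that the averaged set of variables is weighted by cluster size. The function names and the sample profiles are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hierarchical_merges(profiles):
    """Merge the two closest clusters until one remains.

    profiles maps an entity name to its list of attribute values.
    Returns the merges in order as (members_a, members_b, distance).
    """
    # Each active cluster is (member names, averaged attributes, size).
    active = [((name,), list(values), 1) for name, values in profiles.items()]
    merges = []
    while len(active) > 1:
        best = None
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                d = squared_distance(active[i][1], active[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        members_i, mean_i, n_i = active[i]
        members_j, mean_j, n_j = active[j]
        # Size-weighted average of the two clusters' attribute vectors.
        mean = [(n_i * a + n_j * b) / (n_i + n_j) for a, b in zip(mean_i, mean_j)]
        merges.append((members_i, members_j, d))
        active = [c for k, c in enumerate(active) if k not in (i, j)]
        active.append((members_i + members_j, mean, n_i + n_j))
    return merges


if __name__ == "__main__":
    sample = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [10.0, 0.0]}
    for step in hierarchical_merges(sample):
        print(step)
```

The list of merges, read in order, carries the same information as the clustering tree: each entry is one join, together with the distance at which it occurred.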


[Figure 5. Hierarchical Clustering Tree]


The major advantage to the hierarchical clustering
methods is that they allow one to view the overall
relationships among the data units. The tree diagram shows
the natural clusters which occur and the degree of
similarity at the end of each clustering cycle. From this,
the analyst is free to choose the optimal number of clusters
for his purposes.




V.   TECHNICAL DETAILS OF PROCEDURES USED


Discriminant  analysis  was  applied  to  the  data  to 
determine  if  a  difference  existed  between  the  group  means 
and,  if  so,  to  construct  decision  rules  to  place  members  in 
the  proper  groups. 

The  first  step  in  the  analysis  was  to  determine  if  the 
variance-covariance  matrices  of  the  groups  were  equal.  The 
results  of  this  dictated  whether  or  not  to  use  a  linear  or 
quadratic  discriminating  function.  In  this  analysis,  every 
sample  produced  unequal  variance-covariance  matrices. 
Therefore,  the  quadratic  function  was  used  in  every  case. 

Next, the data from each group were investigated to
determine if a difference existed between the means of the
groups in terms of the variables being investigated. The
derivation of the two group case is presented in the
following paragraphs. The three group case is found in
Appendix B.

For this analysis, the groups were chosen to be of equal
size ($N_1 = N_2 = N$) and the variance-covariance matrices are
assumed to be unequal. The $N$ observations from each of the test
passer and non-test passer groups may be represented in
vector notation as follows, where the $n$th observation from
each group is a $1 \times m$ vector

$$x_n = (x_{1,n}, \ldots, x_{m,n})$$

where $n = 1, \ldots, N$ and $m$ is the number of variables in each
observation. The group means $\mu_1$ and $\mu_2$ are also vectors of
length $m$; for example,

$$\mu_1 = (\mu_{1,1}, \ldots, \mu_{m,1})$$


The null hypothesis for this section of the analysis is that
the means of the groups in terms of the determining
factors are equal, or

$$H_0: \mu_1 = \mu_2$$


The standard method in univariate statistics for
determining if the mean of a population with unknown
variance is equal to a specific value is the t-test. Given
that the population has a $N(\mu, \sigma^2)$ distribution, the t
statistic is defined as follows:

$$t = \frac{\sqrt{N}\,(\bar{x} - \mu)}{s}$$

The extension of the t statistic to multivariate analysis is
the $T^2$ statistic, which is defined as

$$T^2 = N (\bar{x} - \mu)' S^{-1} (\bar{x} - \mu)$$


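Now let $y_i = x_{i,1} - x_{i,2}$ and $N = N_1 = N_2$, where $x_{i,g}$
denotes the $i$th observation vector from group $g$. The mean
$\bar{y}$ is defined as

$$\bar{y} = \frac{1}{N} \sum_{i=1}^{N} (x_{i,1} - x_{i,2}) = \bar{x}_1 - \bar{x}_2$$

and the variance-covariance matrix is

$$S = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y})(y_i - \bar{y})'
    = \frac{1}{N-1} \sum_{i=1}^{N} (x_{i,1} - x_{i,2} - \bar{x}_1 + \bar{x}_2)(x_{i,1} - x_{i,2} - \bar{x}_1 + \bar{x}_2)'$$

The $T^2$ statistic in terms of $y$ is

$$T^2 = N \bar{y}' S^{-1} \bar{y}$$

and the null hypothesis as stated above can be rewritten as

$$H_0: \mu_1 - \mu_2 = 0$$

to accommodate $T^2$ in terms of $y$.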

The object of the $T^2$ test is to find a confidence region
about $\mu_1$ and $\mu_2$ as shown below. The size of the
confidence region is dependent on $\alpha$, the level of
significance of the test.




[Figure 6. Confidence Region About $\mu_1 - \mu_2$]


With a scaling change, the $T^2$ statistic is distributed
according to an F distribution with $m$ and $N-m-1$ degrees of
freedom [5]:

$$T^2 \sim \frac{(N-1)\,m}{N-m-1}\, F_{m,\,N-m-1}(\alpha)$$

Therefore, if

$$T^2\, \frac{N-m-1}{(N-1)\,m} \ge F_{m,\,N-m-1}(\alpha)$$

the hypothesis that the means are equal is rejected.
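As an illustration of the two group test, the following Python sketch computes the paired-difference $T^2$ statistic and the scaled value that is compared with the tabled F value. It is a minimal sketch under stated assumptions, not the routine used in the study: the function names are hypothetical, the scaling uses the $N-m-1$ degrees of freedom quoted in the text, and the critical value of F must be looked up separately because the Python standard library provides no F quantile.

```python
def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def paired_t2(group1, group2):
    """T^2 = N * ybar' S^-1 ybar for differences y_i = x_(i,1) - x_(i,2)."""
    diffs = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(group1, group2)]
    n, m = len(diffs), len(diffs[0])
    ybar = [sum(col) / n for col in zip(*diffs)]
    # Sample variance-covariance matrix of the differences.
    s = [[sum((d[i] - ybar[i]) * (d[j] - ybar[j]) for d in diffs) / (n - 1)
          for j in range(m)] for i in range(m)]
    z = solve(s, ybar)  # z = S^-1 * ybar
    return n * sum(y * w for y, w in zip(ybar, z))


def scaled_statistic(t2, n, m):
    """Value compared with the tabled F, using the scaling given in the text."""
    return t2 * (n - m - 1) / ((n - 1) * m)


if __name__ == "__main__":
    g1 = [[2.0, 1.0], [1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
    g2 = [[0.0, 0.0]] * 4
    t2 = paired_t2(g1, g2)
    print(t2, scaled_statistic(t2, n=4, m=2))
```

With a single variable the statistic reduces to the square of the ordinary paired t statistic, which gives a quick check on the arithmetic.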

Once the inequality of the means has been established,
the decision rules can be constructed. Again, the two group
case will be dealt with here and the three group case can be
found in Appendix C.




Anderson states that in addition to the basic
discriminant function, the classification function must take
into account the a priori probability of group membership
and/or the costs of misclassification. Furthermore, he
states that, "the good classification scheme minimizes the
bad effects of misclassification" [5]. As a means of
exploring this, let M be the misclassification function
which is to be minimized.


$$M = P(1|2)\,\pi_2 + P(2|1)\,\pi_1$$

Here $\pi_h$ is the a priori probability of an observation being
drawn from group $h$, and $P(g|h)$ is the probability of
classifying an observation as a member of group $g$ when it is
actually a member of group $h$. The costs of
misclassification could be included but are unknown and are
not dealt with in this study. The quantity $M$ is minimized
by the following rule: assign to group one if

$$\frac{f_1(x)}{f_2(x)} \ge \frac{\pi_2}{\pi_1}$$

Otherwise, assign to group two. Here $f_i(x)$ is the probability
density function of population $i$. Assuming multivariate
normal populations with unequal variance-covariance
matrices, the above equation becomes


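$$\frac{\left[(2\pi)^{m/2} |\Sigma_1|^{1/2}\right]^{-1} \exp\left[-\tfrac{1}{2}(x-\mu_1)' \Sigma_1^{-1} (x-\mu_1)\right]}
       {\left[(2\pi)^{m/2} |\Sigma_2|^{1/2}\right]^{-1} \exp\left[-\tfrac{1}{2}(x-\mu_2)' \Sigma_2^{-1} (x-\mu_2)\right]}
  \ge \frac{\pi_2}{\pi_1}$$

where $\Sigma_i$ is the variance-covariance matrix of group $i$.
Taking natural logs of both sides and simplifying yields

$$\tfrac{1}{2} \ln\left[|\Sigma_2|\,|\Sigma_1|^{-1}\right]
  - \tfrac{1}{2}\left[(x-\mu_1)' \Sigma_1^{-1} (x-\mu_1) - (x-\mu_2)' \Sigma_2^{-1} (x-\mu_2)\right]
  \ge \ln\left[\pi_2 / \pi_1\right]$$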


By rearranging the equation, the quadratic decision rule is
formed.

$$x' (\Sigma_1^{-1} - \Sigma_2^{-1}) x - 2 (\mu_1' \Sigma_1^{-1} - \mu_2' \Sigma_2^{-1}) x
  + \mu_1' \Sigma_1^{-1} \mu_1 - \mu_2' \Sigma_2^{-1} \mu_2
  \le \ln\left[|\Sigma_2|\,|\Sigma_1|^{-1}\right] - 2 \ln\left[\pi_2 / \pi_1\right]$$


If  the  above  equation  is  true,  the  observation  should  be 
assigned  to  group  1.  If  not,  assign  the  observation  to 
group    2. 
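The quadratic decision rule can be illustrated for the special case of two variables with a short Python sketch. The restriction to 2 x 2 matrices, the function names, and the example parameters are assumptions made to keep the sketch self-contained; they are not taken from the study. The left side is computed as the difference of the two quadratic forms, which is algebraically the same as the expanded expression in the rule.

```python
import math


def invert_2x2(s):
    """Return (inverse, determinant) of a 2 x 2 matrix."""
    (a, b), (c, d) = s
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]], det


def quadratic_form(x, mu, s_inv):
    """(x - mu)' S^-1 (x - mu) for two variables."""
    d = [x[0] - mu[0], x[1] - mu[1]]
    return sum(d[i] * s_inv[i][j] * d[j] for i in range(2) for j in range(2))


def classify(x, mu1, sigma1, mu2, sigma2, prior1=0.5, prior2=0.5):
    """Return 1 or 2 using the quadratic rule with unequal covariances."""
    inv1, det1 = invert_2x2(sigma1)
    inv2, det2 = invert_2x2(sigma2)
    lhs = quadratic_form(x, mu1, inv1) - quadratic_form(x, mu2, inv2)
    rhs = math.log(det2 / det1) - 2.0 * math.log(prior2 / prior1)
    return 1 if lhs <= rhs else 2


if __name__ == "__main__":
    mu1, sigma1 = (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]
    mu2, sigma2 = (3.0, 3.0), [[4.0, 0.0], [0.0, 4.0]]
    for point in [(0.0, 0.0), (3.0, 3.0), (-5.0, -5.0)]:
        print(point, classify(point, mu1, sigma1, mu2, sigma2))
```

In this example the point (-5, -5) lies on the far side of group 1 from group 2, yet it is assigned to group 2 because that group has the larger dispersion. A linear boundary cannot produce this behavior, which is the practical difference between the two types of function.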

It is also possible to determine the contribution that
each variable makes to the discriminant function. That is,
it is possible to compute the relative amount of
discriminatory power that each variable has.


Let $V_i$ be an $m \times 1$ vector of coefficients of the $i$th
discriminating function. This vector is transformed into a
scaled vector $V_i^*$ by multiplying each element of $V_i$ by the
square root of the corresponding diagonal element of the
pooled within groups deviation sum of squares matrix $W$,
where

$$w_{i,r} = \sum_{g} \sum_{n} (x_{gin} - \bar{x}_{gi})(x_{grn} - \bar{x}_{gr})$$

The scaled vector $V_i^*$ is

$$V_i^* = \left(v_{i,1}\sqrt{w_{1,1}},\; v_{i,2}\sqrt{w_{2,2}},\; \ldots,\; v_{i,m}\sqrt{w_{m,m}}\right)'$$


The   discrimination   power   of   each   variable   can   now   be 
expressed  as 


rr 


Li-l. 


i,: 


The method of hierarchical clustering used in this study
is due to Ward [6]. This method is based on the computation
of the Euclidean distance between centroids of the entities.
For this discussion, let $x_{ijk}$ be the value of the $i$th
variable of the $j$th data unit in the $k$th cluster, and let
$m_k$ be the number of data units in cluster $k$. The mean
of the $i$th variable in the $k$th cluster is

$$\bar{x}_{ik} = \frac{1}{m_k} \sum_{j=1}^{m_k} x_{ijk}$$

and the error sum of squares for cluster $k$ is

$$E_k = \sum_{i} \sum_{j=1}^{m_k} (x_{ijk} - \bar{x}_{ik})^2$$

The total within group error sum of squares is

$$E = \sum_{k} E_k$$



The criterion for the Ward method is the minimization of
the increase in the within group error sum of squares,

$$\Delta E_{pq} = E_t - E_p - E_q$$

where $E_p$ and $E_q$ are the within group error sums of squares
for the two entities which are joined to form the cluster $t$.
By simplification,

$$\Delta E_{pq} = \frac{m_p m_q}{m_p + m_q} \sum_{i} (\bar{x}_{ip} - \bar{x}_{iq})^2$$

This  function  should  force  the  entities  to  group  into  tight 
clusters.  Also,  this  should  cause  the  distance  between  the 
clusters  to  he  at  a  maximum. 
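
The equivalence between the increase in the error sum of squares and the simplified centroid form can be checked numerically. The following Python sketch is illustrative only: the helper names and the two small clusters are hypothetical and are not data from this study.

```python
def centroid(cluster):
    """Mean of each variable over the data units (rows) of a cluster."""
    m = len(cluster)
    return [sum(col) / m for col in zip(*cluster)]

def ess(cluster):
    """Error sum of squares of a cluster about its own centroid."""
    c = centroid(cluster)
    return sum((x - ci) ** 2 for row in cluster for x, ci in zip(row, c))

def ward_increase(p, q):
    """Simplified form: m_p*m_q/(m_p+m_q) times squared centroid distance."""
    cp, cq = centroid(p), centroid(q)
    mp, mq = len(p), len(q)
    return mp * mq / (mp + mq) * sum((a - b) ** 2 for a, b in zip(cp, cq))

# Two hypothetical clusters, each row being one data unit with two variables.
p = [[1.0, 2.0], [3.0, 4.0]]
q = [[6.0, 8.0], [8.0, 8.0], [7.0, 11.0]]

direct = ess(p + q) - ess(p) - ess(q)   # E_t - E_p - E_q
shortcut = ward_increase(p, q)          # simplified expression
```

Both quantities agree, so at each stage the pair of entities to join can be chosen from the centroids and cluster sizes alone, without recomputing the full error sum of squares.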




VI.        RESULTS OF ANALYSIS


The discriminant analysis discussed in this section was
done with the aid of computer routines appearing in
Eisenbeis and Avery [2]. These routines use a complete
stepwise procedure to guarantee the best discriminant
function possible from the given data. This procedure
investigates every possible combination of determining
factors, beginning with subsets of one factor and continuing
up to m factors [2]. For each subset size the best
combination of variables is chosen. When the analysis is
complete, the analyst is free to choose the subset with the
best function for his purposes. In this study, all m
variables were used in every case.
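
The exhaustive search over subsets described above can be sketched in Python as follows. This is only a sketch of the idea of examining every combination of factors at each subset size; the scoring function and the weights are hypothetical stand-ins and are not the criterion used by the Eisenbeis and Avery routines.

```python
from itertools import combinations

def best_subsets(factors, score):
    """For each subset size 1..m, return the highest-scoring combination."""
    best = {}
    for size in range(1, len(factors) + 1):
        best[size] = max(combinations(factors, size), key=score)
    return best

# Hypothetical per-factor weights, used only to make the sketch runnable.
weights = {"race": 0.30, "age": 0.10, "edu_cert": 0.20}

def toy_score(subset):
    """Stand-in criterion: the sum of the weights of the chosen factors."""
    return sum(weights[name] for name in subset)

best = best_subsets(list(weights), toy_score)
```

With m factors there are 2^m - 1 non-empty subsets to examine, so the search grows quickly as factors are added.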

The analysis was conducted on six samples. The first
was at the all Navy level where eligible members were
randomly selected without regard to whether they were
designated strikers. Next, the ratings in each of the
Navy's general assignment categories (Seamen, Firemen,
Airmen, and Constructionmen) were subjected to cluster
analysis to find which rates tended to be most alike. These
clusters were formed using computer routines designed by
Anderberg [4]. The variables used to determine the
clustering were the average individual ASVAB test scores
from the August 1978 cohort. These variables were chosen
because they normally remain reasonably constant over a
member's career. The averages were computed by summing the
test scores (separately for each subtest) for every E-4
through E-6 in each rate and dividing by the number in the
rate. Three of the clusters obtained from the Seamen
category were subjected to discriminant analysis. The last
two sample populations to be analyzed were individual
ratings: the Boiler Technicians and the Machinist's Mates.
The following diagram depicts the analyses which were done.
The reader may find it helpful to refer to this while
reading the succeeding material.


ALL NAVY              2 GROUP      3 GROUP
SN CLSTR 1            2 GROUP      3 GROUP
SN CLSTR 2            2 GROUP      3 GROUP
SN CLSTR 4            2 GROUP      3 GROUP
BOILER TECH           2 GROUP      3 GROUP
MACHINIST'S MATE      2 GROUP      3 GROUP


Figure 7.  Discriminant Analyses Conducted in this Study


For the all Navy sample, 300 members of each group,
i.e., test passers, non-test passers, and out of service
personnel, were randomly selected from the eligible pool.
First, the two group case, test passers vs. non-test passers,
was analyzed. After checking the variance-covariance
matrices and finding them to be unequal, a comparison of the
group means was made. The test for equality of group means
produced an F statistic of 3.83, which rejected the null
hypothesis at the 0.01 level of significance (see Appendix
D). The variables in this sample which provided the
greatest power of discrimination were race (24.5%), high
school diploma (20.3%), and dependents (13.59%). The
analysis indicated Caucasian members had a higher
probability of passing the exam than their non-Caucasian
counterparts. Individuals having a high school diploma
and/or having dependents also were more likely to pass the
exam. A complete list of the determining factors and the
amount of discriminating power for each sample population is
given in Appendix E.

Once the discriminant function was computed, the
classification of each member was done using a quadratic
function as described in chapter V. The rate of "hits" or
correct classification for the all Navy two group sample was
62% (see Appendix G). This classification rate includes the
effects of the a priori probability of group membership, as
do all other classification rates in this study.
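
Because an overall hit rate can conceal very different rates within the individual groups, it is useful to compute both. The following Python sketch shows one way to do so; the group labels and the eight observations are hypothetical and do not come from the samples in this study.

```python
from collections import Counter

def hit_rates(actual, predicted):
    """Return (overall hit rate, {group: hit rate within that group})."""
    totals = Counter(actual)
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    overall = sum(hits.values()) / len(actual)
    per_group = {g: hits[g] / totals[g] for g in totals}
    return overall, per_group

# Hypothetical actual and predicted group memberships.
actual    = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]
predicted = ["pass", "pass", "pass", "fail", "pass", "pass", "fail", "fail"]

overall, per_group = hit_rates(actual, predicted)
```

Here five of eight members are classified correctly overall, but the rate is three of four in one group and only two of four in the other.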

The three group case for the all Navy sample was also
investigated. Here the equality of the group means was
rejected at the 0.01 level of significance. In this case
the length of time remaining in service (EAOS), 32.0%, and
length of service, 12.27%, proved to be the most powerful
determining factors. Both of these variables discriminated
best between the out of service group and the other two
groups. This is probably because a member with one year or
less remaining in his obligation is not likely to take the
exam if he is planning to leave the service, because the
rewards he will reap are not worth the effort to prepare for
the exam. In this sample the rate of hits was 63.0%, which
was slightly better than the two group case.

Clustering analysis of the seamen rates resulted in four
clusters (see Appendix F). The ratings tended to group by
job skills: the electronics rates in one cluster, the
administrative rates in another, etc. Three of the clusters
were subjected to discriminant analysis. The first cluster
investigated was cluster four in Appendix F. The test for
equality of group means generated an F statistic of 1.99
which was significant at the 2.6% level. The largest
discriminating variables in this sample were years of
education (21.77%), race (15.0%), the numerical reasoning
section of the BTB (13.7%), and the clerical aptitude
section of the BTB (10.46%). For this sample 67% of the
observations were correctly classified.

Cluster four was also subjected to the three group
discriminant analysis. Here the F statistic was 5.25 and
was significant at the $10^{-30}$ level. As in the all Navy
group, length of time remaining in service was the largest
discriminator (29%), but was followed by the BTB test for
shop aptitude (12.09%) and time in service (8.24%). The
three group sample was correctly classified 59.0% of the
time.

The  Seamen  rating  cluster  two,  two  group  sample,  was 
analyzed  next.  The  test  for  equality  of  group  means  was 
significant  at  the  12%  level,  indicating  that  the  means  of 
the  groups  were  close  to  one  another.  Since  the  means  were 
so  close,  the         discriminating         variables        and        the 

classification    rates    should    be    viewed    with    care. 

The three group case of the Seamen cluster two did not
test as having equal means and classified 63.0% of the
observations correctly. Even so, only 43% of the non-test
passers were correctly classified.

The Seamen cluster one consisted mostly of electronics
ratings. The two group test of equality of group means
produced an F statistic which was significant at the 45%
level. This is hard to accept until one realizes the entry
requirements for these rates are very specific. Most
entrants have a high school diploma, at least twelve years
of school, have gone to A-school, etc. In short, the
cluster is so homogeneous in terms of the available data
that it was not possible to discriminate between the groups.

The three group case fared somewhat better, as with the
other clusters. The hypothesis of equal group means was
rejected and the classification was correct 69.0% of the
time. However, it must be noted that only 38% of the test
passers were correctly classified.

The test of group means for the Boiler Technicians two
group sample resulted in an F statistic which was
significant at 94%. Because of this, the classification
results should be used with care. Even though the three
group analysis showed the means were not equal, the
percentage of correct classifications was only 50.0%. Again
it was this good only because of the addition of the out of
service group. The distinction between the test passers and
non-test passers was not very good, with only 23.0% of the
non-test passers being correctly identified.

The Machinist's Mate rating two group case produced
surprising results when the test of the equality of the
group means produced an F statistic which was significant at
0.36%. Considering the results of the other samples, one
would have expected the significance level to be much
larger. The most powerful determining factors for this case
were the BTB tests for general intelligence (29.91%) and
mechanical aptitude (26.44%). Of the total sample, 66% was
classified correctly.

The three group test produced an F statistic which was
significant at $10^{-26}$. This sample showed time remaining in
the Navy to be the most powerful determining factor at
18.05%. Of the total sample, 66% was correctly identified
using the decision rules.

From  the  above  results  it  can  be  seen  that  the  means  in 
most  of  the  two  group  samples  are  too  close  together  to 
permit  any  distinct  discrimination.  This  is  also  true  for 
the  three  group  samples  because,  although  the  out  of  service 
group  is  distinct  enough  to  cause  the  equality  of  group 
means  test  to  fail,  the  test  passer  and  non-test  passer 
groups  remain  as  indistinct  as  before.  The  fact  that  the 
all  Navy  sample  tested  as  not  having  equal  means,  while  all 
other  samples  did,  appears  to  be  due  to  the  variability  of 
the  data  in  the  all  Navy  case. 




VII.        CONCLUSIONS 


Discriminant      analyses   and   cluster   analyses    were   used  in 

this   study   as   tools    to   attempt    to   determine    if   there      was  a 

difference      between   E-4    test    passers    and   non-test    passers  in 
terms   of    the    available   biographic    data. 

The study has shown with a high level of certainty that
the available data do not differentiate sharply between the
groups. It would appear that the entrance requirements for
the Navy or the specific rating create a homogeneous group
of personnel in terms of these variables. Since there
appears to be nothing in the general biographic background
of a member to distinguish between the groups, other
possible explanatory variables might be inspected. For
instance, where was a member's last duty station, what type
of leaders did he have, was he training in a rating he
liked?

Although the discriminating variables were not found in
the study, the use of the E-4 exam as a measure of
effectiveness of a member's utility to the Navy still merits
study. Sharper discrimination might result if data
reflecting actual test takers are used.




APPENDIX  A 


LIST OF DETERMINING FACTORS


FACTOR                        TYPE        RANGE

 1.  Dependents               Binary      0 - No dependents
                                          1 - Dependents

 2.  Age                      Interval    19-33

 3.  Race                     Binary      0 - Non-Caucasian
                                          1 - Caucasian

 4.  Sex                      Binary      0 - Female
                                          1 - Male

 5.  A-School                 Binary      0 - No A-School
                                          1 - A-School

 6.  Years of education       Interval    9-16

 7.  Education                Binary      0 - No high school diploma
     certification                        1 - High school diploma

 8.  BTB (Gen. Intel.)        Interval    31-74

 9.  BTB (Num. Reason.)       Interval    32-69

10.  BTB (Mech. Apt.)         Interval    37-70

11.  BTB (Clr. Apt.)          Interval    38-76

12.  BTB (Shop Apt.)          Interval    37-70

13.  Time in rate             Interval    6-59 mos

14.  Time remaining           Interval    0-48 mos

15.  Length of service        Interval    12-72 mos



APPENDIX    B 


K    GROUP    TEST    FOR    EQUALITY    OF    GROUP    MEANS 


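$$ H_0: \mu_1 = \mu_2 = \cdots = \mu_k $$

This implies $\mu_i - \mu_j = 0$ for $1 \le i, j \le k$.

Let

$$ \omega(1) = \mu_1 - \mu_2 = 0 $$

$$ \omega(2) = \mu_1/2 + \mu_2/2 - \mu_3 = 0 $$

$$ \vdots $$

$$ \omega(k-1) = \mu_1/(k-1) + \cdots + \mu_{k-1}/(k-1) - \mu_k = 0 $$

Now

$$ H_0: \sum_{i=1}^{k-1} \omega(i) = 0 $$

which is the form of Anderson's test of the hypothesis [5]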


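$$ H_0: \sum_{i} \beta_i \mu_i = 0 $$

Let $y_i = \beta_1 x_{i,1} + \beta_2 x_{i,2} + \cdots + \beta_k x_{i,k}$

then $\bar{y} = \sum_{i} \beta_i \bar{x}_i$

and

$$ S = \frac{1}{N-1} \sum_{i} (y_i - \bar{y})(y_i - \bar{y})' $$

and $T^2 = N (\bar{y} - \mu)' S^{-1} (\bar{y} - \mu)$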


if 


(Jf-2)  a       -^     JJ-m-1 


then   reject    the   equality   of    group    means. 




APPENDIX  C 


K  GROUP  CLASSIFICATION  FUNCTION 


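Assign to group $g$ if for all other groups $h$

$$ f_g(x)/f_h(x) \ge \pi_h/\pi_g, \qquad h = 1, \ldots, k, \; h \ne g $$

With multivariate normal densities having means $\mu_g$ and
covariance matrices $\Sigma_g$, taking logarithms gives the
equivalent quadratic rule. Assign to group $g$ if for all
other groups $h$:

$$ X'(\Sigma_g^{-1} - \Sigma_h^{-1})X - 2(\mu_g'\Sigma_g^{-1} - \mu_h'\Sigma_h^{-1})X + \mu_g'\Sigma_g^{-1}\mu_g - \mu_h'\Sigma_h^{-1}\mu_h \le \ln|\Sigma_h \Sigma_g^{-1}| - 2\ln[\pi_h/\pi_g] $$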
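
The pairwise rule is equivalent to assigning an observation to the group with the largest value of ln(prior) - 0.5 ln|cov| - 0.5 (x - mean)' cov^-1 (x - mean). The following Python sketch applies that form for two variables; the function names, group labels, means, covariance matrices, and priors are all hypothetical values chosen for illustration, not estimates from this study.

```python
import math

def quad_score(x, mean, cov, prior):
    """ln(prior) - 0.5*ln|cov| - 0.5*(x-mean)' cov^-1 (x-mean), 2x2 cov only."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form using the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha

def classify(x, groups):
    """Assign x to the group with the largest quadratic score."""
    return max(groups, key=lambda g: quad_score(x, *groups[g]))

# Hypothetical groups: (mean vector, covariance matrix, a priori probability).
groups = {
    "passer":     ((60.0, 55.0), ((25.0, 5.0), (5.0, 16.0)), 0.5),
    "non-passer": ((50.0, 50.0), ((36.0, 0.0), (0.0, 36.0)), 0.5),
}

first = classify((62.0, 56.0), groups)
second = classify((48.0, 49.0), groups)
```

Because each group keeps its own covariance matrix, the boundary between groups is quadratic rather than linear in the observation.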




APPENDIX  D 


TESTS OF EQUALITY OF GROUP MEANS


                         EQUALITY OF VARIANCE       EQUALITY OF MEANS
SAMPLE                   F STATISTIC  LEVEL (%)     F STATISTIC  LEVEL (%)

All Navy (2 Group)          1.78        0.0            3.83         0.0
All Navy (3 Group)          2.34        0.0           16.69         0.0
SN Clstr 1 (2 Group)        1.88        0.0             .99        45.29
SN Clstr 1 (3 Group)        2.07        0.0            8.15         0.0
SN Clstr 2 (2 Group)        1.41         .87           1.49        12.36
SN Clstr 2 (3 Group)        1.75        0.0            3.14         0.0
SN Clstr 4 (2 Group)        1.98        0.0            1.99         2.56
SN Clstr 4 (3 Group)        2.06        0.0            6.25         0.0
BT (2 Group)                2.08        0.0             .41        94.70
BT (3 Group)                1.78        0.0            4.46         0.0
MM (2 Group)                1.90        0.0            2.61          .36
MM (3 Group)                2.10        0.0            7.43         0.0



APPENDIX    E 


POWER OF DETERMINING FACTORS


ALL NAVY

2 GROUP                           3 GROUP

Race                   24.50%     Time remaining         32.03%
Edu. cert.             20.30%     Time in service        12.27%
Dependents             13.59%     Dependents              7.84%
Age                    10.55%     Sex                     7.45%
BTB (Clr. Apt.)         8.47%     Edu. cert.              6.99%
BTB (Num. Reason.)      5.58%     Time in rate            6.46%
BTB (Mech. Apt.)        5.28%     BTB (Shop Apt.)         6.04%
A-School                4.59%     BTB (Mech. Apt.)        4.85%
BTB (Gen. Intel.)       4.59%     A-School                3.98%
Years of edu.           1.57%     BTB (Clr. Apt.)         3.04%
BTB (Shop Apt.)         0.48%     Race                    2.43%
Sex                     0.35%     Age                     2.05%
                                  Years of edu.           1.67%
                                  BTB (Gen. Intel.)       1.49%
                                  BTB (Num. Reason.)      1.35%


SEAMEN GROUP 1

2 GROUP                           3 GROUP

BTB (Shop Apt.)        29.43%     Time remaining         35.72%
BTB (Clr. Apt.)        17.88%     BTB (Gen. Intel.)      10.95%
BTB (Num. Reason.)     14.32%     BTB (Shop Apt.)         8.82%
BTB (Gen. Intel.)       8.55%     Age                     8.19%
BTB (Mech. Apt.)        6.58%     Time in rate            6.21%
Dependents              6.07%     BTB (Clr. Apt.)         6.17%
A-School                5.14%     BTB (Mech. Apt.)        4.89%
Age                     4.23%     Time in service         3.54%
Race                    3.17%     A-School                3.46%
Edu. cert.              3.14%     Dependents              3.45%
Yrs of edu.             1.33%     Edu. cert.              3.38%
                                  BTB (Num. Reason.)      2.13%
                                  Yrs of edu.             1.93%
                                  Race                    1.03%


SEAMEN GROUP 2

2 GROUP                           3 GROUP

A-School               17.30%     Time remaining         37.68%
BTB (Mech. Apt.)       14.93%     A-School                6.76%
BTB (Clr. Apt.)        13.35%     Time in rate            6.69%
BTB (Shop Apt.)        11.26%     Dependents              6.24%
Yrs of edu.             9.62%     BTB (Num. Reason.)      5.76%
Edu. cert.              9.17%     Edu. cert.              5.54%
Sex                     7.14%     Yrs of edu.             5.42%
BTB (Gen. Intel.)       4.85%     Time in service         5.22%
BTB (Num. Reason.)      4.73%     Race                    4.52%
Age                     2.81%     BTB (Shop Apt.)         3.99%
Dependents              2.40%     BTB (Mech. Apt.)        3.57%
Race                    1.30%     BTB (Clr. Apt.)         3.49%
                                  BTB (Gen. Intel.)       2.60%
                                  Sex                     1.39%
                                  Age                     1.05%


SEAMEN GROUP 4

2 GROUP                           3 GROUP

Years of edu.          21.77%     Time remaining         29.27%
Race                   15.04%     BTB (Shop Apt.)        12.09%
BTB (Num. Reason.)     13.70%     Time in service         8.24%
BTB (Clr. Apt.)        10.46%     Dependents              7.94%
BTB (Shop Apt.)        10.31%     Edu. cert.              7.14%
Sex                     6.13%     BTB (Gen. Intel.)       6.67%
Edu. cert.              6.01%     A-School                4.39%
Dependents              4.52%     Years of edu.           4.15%
BTB (Gen. Intel.)       3.41%     BTB (Num. Reason.)      4.06%
A-School                3.32%     Sex                     3.97%
Age                     2.39%     Race                    3.94%
BTB (Mech. Apt.)        2.39%     Time in rate            2.58%
                                  BTB (Mech. Apt.)        1.96%
                                  BTB (Clr. Apt.)         1.79%
                                  Age                     1.74%


BOILER TECHNICIAN

2 GROUP                           3 GROUP

BTB (Gen. Intel.)      18.11%     Time remaining         18.05%
BTB (Num. Reason.)     16.96%     Years of edu.          15.39%
BTB (Clr. Apt.)        15.62%     BTB (Shop Apt.)        12.84%
Years of edu.          12.32%     Time in service        10.73%
Edu. cert.             10.30%     Age                    10.46%
A-School                9.43%     BTB (Mech. Apt.)       10.25%
BTB (Shop Apt.)         6.99%     Edu. cert.              7.46%
Dependents              4.39%     A-School                3.23%
BTB (Mech. Apt.)        4.35%     BTB (Gen. Intel.)       3.13%
Age                     1.42%     Time in rate            2.92%
Race                    0.35%     BTB (Num. Reason.)      2.19%
                                  Dependents              1.39%
                                  Race                    1.30%
                                  BTB (Clr. Apt.)         0.58%


MACHINIST'S MATE

2 GROUP                           3 GROUP

BTB (Gen. Intel.)      29.91%     Time remaining         29.56%
BTB (Mech. Apt.)       26.44%     BTB (Mech. Apt.)       15.55%
Age                    10.01%     Edu. cert.             11.17%
BTB (Num. Reason.)      9.68%     Time in service         8.30%
BTB (Clr. Apt.)         7.97%     BTB (Gen. Intel.)       6.91%
Dependents              5.15%     Time in rate            6.26%
Years of edu.           5.33%     BTB (Clr. Apt.)         5.29%
A-School                2.41%     Age                     4.65%
Race                    1.70%     Years of edu.           3.38%
Edu. cert.              1.53%     Race                    3.05%
BTB (Shop Apt.)         0.12%     BTB (Shop Apt.)         1.80%
                                  BTB (Num. Reason.)      1.65%
                                  A-School                1.62%
                                  Dependents              0.25%


APPENDIX F


RESULTS OF CLUSTERING OF SEAMEN RATINGS


CLUSTER 1                              CLUSTER 2

STS - Sonar Tech (S)                   SK  - Storekeeper
STG - Sonar Tech (G)                   PC  - Postal Clerk
FTG - Fire Cntl Tech (gun)             YN  - Yeoman
FTM - Fire Cntl Tech (missile)         DK  - Disbursing Clerk
FTB - Fire Cntl Tech (FBM)             MS  - Mess Specialist
ETN - Electronics Tech (comm)          SH  - Ship's Serviceman
ETR - Electronics Tech (radar)         LI  - Lithographer
EW  - Elec Warfare Tech                CTA - Comm Tech (admin)
CTM - Comm Tech (maint.)               TM  - Torpedoman's Mate
CTI - Comm Tech (interp.)              BM  - Boatswain's Mate
DS  - Data Systems Tech
MT  - Missile Tech


CLUSTER 3                              CLUSTER 4

GMG - Gunner's Mate (gun)              CTT - Comm Tech (tech)
GMM - Gunner's Mate (missiles)         CTO - Comm Tech (comm)
GMT - Gunner's Mate (tech)             CTR - Comm Tech (collection)
MN  - Mineman                          DP  - Data Processing Tech
OM  - Opticalman                       IS  - Intelligence Spec
MA  - Master at Arms                   PN  - Personnelman
IM  - Instrumentman                    RM  - Radioman
                                       DM  - Draftsman
                                       QM  - Quartermaster
                                       OS  - Operations Spec
                                       OT  - Ocean Sys Spec
                                       MU  - Musician
                                       JO  - Journalist
                                       LN  - Legalman




APPENDIX  G 


CLASSIFICATION RESULTS


                                  % CORRECTLY CLASSIFIED
SAMPLE                  TOTAL     TEST       NON TEST    OUT OF
                                  PASSERS    PASSERS     SERVICE

All Navy (2 Group)       62%       80%        45%
All Navy (3 Group)       63%       77%        39%         71%
SN Clstr 1 (2 Group)     74%       32%        88%
SN Clstr 1 (3 Group)     69%       33%        191         72%
SN Clstr 2 (2 Group)     63%       11%        49%
SN Clstr 2 (3 Group)     63%       71%        43%         76%
SN Clstr 4 (2 Group)     67%       81%        50%
SN Clstr 4 (3 Group)     69%       80%        42%         77%
BT (2 Group)             oos       36%        34%
BT (3 Group)             57%       30%        28%         60%
MM (2 Group)             66%       71%        60%
MM (3 Group)             66%       S2%        58%         77%


APPENDIX  H 


CLUSTERING TREE FOR SEAMEN RATING


[Figure: clustering tree for the Seamen ratings, run title
SEAMEN RATINGS 3-78. Each row of the tree is labeled with an
item name and ID number, and the columns are numbered
classes. Note on the figure: final 2 branches join at class
25.]


SEAMEN RATINGS 3-78
THIS RUN DEPICTS THE PORTION OF THE TREE GENERATED BETWEEN STAGE 1 AND STAGE 43

THE CRITERION VALUES ARE SEGMENTED INTO THE FOLLOWING CLASSES.

CLASS     LOWER BOUND         UPPER BOUND

  1       0.14972687E-03      0.19742124E-01
  2       0.19742124E-01      0.39334521E-01
  3       0.39334521E-01      0.58926918E-01
  4       0.58926918E-01      0.78519285E-01
  5       0.78519285E-01      0.98111629E-01
  6       0.98111629E-01      0.11770397E 00
  7       0.11770397E 00      0.13729632E 00
  8       0.13729632E 00      0.15688866E 00
  9       0.15688866E 00      0.17648101E 00
 10       0.17648101E 00      0.19607335E 00
 11       0.19607335E 00      0.21566570E 00
 12       0.21566570E 00      0.23525804E 00
 13       0.23525804E 00      0.25485039E 00
 14       0.25485039E 00      0.27444273E 00
 15       0.27444273E 00      0.29403508E 00
 16       0.29403508E 00      0.31362742E 00
 17       0.31362742E 00      0.33321977E 00
 18       0.33321977E 00      0.35281211E 00
 19       0.35281211E 00      0.37240446E 00
 20       0.37240446E 00      0.39199680E 00
 21       0.39199680E 00      0.41158915E 00
 22       0.41158915E 00      0.43118149E 00
 23       0.43118149E 00      0.45077384E 00
 24       0.45077384E 00      0.47036618E 00
 25       0.47036618E 00      0.48995972E 00


LIST    OF    REFERENCES 


1.  Center For Naval Analysis Report 1086, Success
Chances of Recruits Entering the Navy, by R. F. Lockman,
February 1977.

2.  Eisenbeis, R. A. and Avery, R. B., Discriminant
Analysis and Classification Procedures, p. 1,
Lexington Books, 1972.

3.  Gilbert, Ethel S., "On Discrimination Using
Qualitative Variables", Journal of the American
Statistical Association, v. 63, p. 1399-1412,
December 1968.

4.  Anderberg, M. R., Cluster Analysis for Applications,
p. 131, Academic Press, 1973.

5.  Anderson, T. W., An Introduction to Multivariate
Statistical Analysis, p. 127, Wiley, 1958.

6.  Ward, Jr., J. H., "Hierarchical Grouping to Optimize
an Objective Function", Journal of the American
Statistical Association, v. 58, p. 236-244, June
1963.


INITIAL  DISTRIBUTION  LIST 


1.  Defense  Documentation  Center 
Cameron  Station 
Alexandria,  Virginia  22314 

2.  Library,  Code  0142 
Naval  Postgraduate  School 
Monterey,  California  93940 

3.  Professor R. R. Read, Code 55Re
Department of Administrative Sciences
Naval Postgraduate School
Monterey, California 93940

4.  Professor R. S. Elster, Code 54Ea
Department of Administrative Sciences
Naval Postgraduate School
Monterey, California 93940

5.  dcno  (m,p,d 

Op  10 

Navy  Annex 

Washington,  D.C,    20370 

6.  Mr. Joe Silverman
Navy Personnel Research and Development Center
San Diego, California 92152

7.  Principal  Deputy  Assistant  Secretary  of  The  Navy 
Room  4E730 

The  Pentagon 
Washington,  D.C.   20350 


No.  Copies 
2 


8.  Center for Naval Analysis
(Attn: Dr. Robert Lockman)
1401 Wilson Boulevard
Arlington, Virginia 22209

9.  Lt Edwin F. Beach
PATRON Four
FPO, San Francisco, California 96601

